Abstract

We provide a non-explicit separation of the number-on-forehead communication complexity classes
RP and NP when the number of players is up to $\delta \cdot \log n$ for any $\delta < 1$. Recent lower bounds on
Set-Disjointness [10, 7] provide an explicit separation between these classes when the number of players is
only up to $o(\log \log n)$.



1 Introduction

In the number-on-forehead (NOF) model of communication complexity, $k$ players are trying to evaluate a
function $F$ defined on $kn$ bits. The input of $F$ is partitioned into $k$ pieces of $n$ bits each, call them $x_1, \ldots, x_k$,
and $x_i$ is placed, metaphorically, on the forehead of player $i$. Thus, each player sees $(k-1)n$ of the $kn$
input bits. The players communicate by writing bits on a shared blackboard in order to compute $F$. This
model was introduced by [5] and it has many applications, including circuit lower bounds OIHl, time/space
tradeoffs for Turing Machines, pseudo-random number generators for space-bounded Turing Machines IH,
and proof system lower bounds [4].



In this model, a protocol is said to be "efficient" if it has complexity $(\log n)^{O(1)}$. Correspondingly, $\mathrm{P}^{cc}_k$, $\mathrm{RP}^{cc}_k$,
$\mathrm{BPP}^{cc}_k$ and $\mathrm{NP}^{cc}_k$ are the classes of functions having efficient deterministic, one-sided-error randomized,
(two-sided-error) randomized and nondeterministic protocols, respectively. The usual inclusions between
these classes apply, so $\mathrm{P}^{cc}_k \subseteq \mathrm{RP}^{cc}_k \subseteq \mathrm{NP}^{cc}_k$ and $\mathrm{RP}^{cc}_k \subseteq \mathrm{BPP}^{cc}_k$. One of the most fundamental questions
in NOF communication complexity is to provide separations between these classes. In [3], Beame et al.
show that $\mathrm{RP}^{cc}_k \ne \mathrm{P}^{cc}_k$ for $k \le n^{O(1)}$ players. Recently, [7, 10] show that $\mathrm{NP}^{cc}_k \not\subseteq \mathrm{BPP}^{cc}_k$ (and thus, that
$\mathrm{NP}^{cc}_k \ne \mathrm{RP}^{cc}_k$) for $k = o(\log \log n)$ players. Our main result in this paper is the following.

Theorem 1.1 (Main Theorem). $\mathrm{NP}^{cc}_k \not\subseteq \mathrm{BPP}^{cc}_k$ (and thus, $\mathrm{NP}^{cc}_k \ne \mathrm{RP}^{cc}_k$) for all $\delta < 1$ and all $k \le \delta \cdot \log n$.

Until very recently, it was far from clear how to obtain communication complexity lower bounds in the
number-on-forehead model for any function that could separate nondeterministic from randomized
complexity. The difficulty can be described as follows. The only method currently known for obtaining
multiparty NOF lower bounds is the discrepancy method [2, 13, 8]. Lower bounds using discrepancy are obtained
by showing that the function in question has small discrepancy with respect to some distribution.
Unfortunately, it is not hard to see that every function with small nondeterministic complexity has high discrepancy
with respect to every distribution (see, for example, Lemma 3.1 in [7]). Thus, the discrepancy method
seemed doomed to failure and new techniques seemed to be required.

However, in very recent work, these difficulties were overcome to obtain a surprisingly elegant lower bound
for the Set-Disjointness function [7, 10]. The idea behind their proofs as well as ours is as follows.

In a recent paper, Sherstov [15] (and implicitly also Razborov [14]) applied the discrepancy method in
a more general way for the 2-player model in order to overcome the above difficulties. The generalized
discrepancy method was adapted to the number-on-forehead model in [7, 10] and can be described at a high
level as follows. Start with some candidate function $F$, where $F$ has small nondeterministic complexity, and
we want to prove that $F$ has high randomized communication complexity. Now come up with a function
$G$ and a distribution $\lambda$ such that: (1) $F$ and $G$ are highly correlated with respect to $\lambda$; and (2) $G$ has small
discrepancy with respect to $\lambda$. It is not hard to see that if such a $G$ can be found, then since $G$ has small
discrepancy, it requires large randomized complexity, and moreover since $F$ and $G$ are very correlated, this
in turn implies lower bounds on the randomized complexity of $F$ as well.

Thus, to use the generalized discrepancy method, the problem is to come up with the functions $F$ and $G$. To
accomplish this, we will use another wonderful idea due to Sherstov [16], substantially generalized to
apply to the number-on-forehead setting by Chattopadhyay [6]. We consider special functions of the form
$F^\phi$. This will be a function on $(k+1)n$ bits, computed by $k+1$ players. Player 0 receives an $n$-bit vector
$x$. Player $i$, for $1 \le i \le k$, gets an $n$-bit vector $y_i$. The function $\phi$ takes as input $y_1, \ldots, y_k$ and outputs an
$n$-bit string $z$, where $z$ has exactly $m$ 1's. We will view $\phi$ as selecting $m$ bits/indices of Player 0's input, $x$.
The function $F^\phi$ will be the OR function applied to the $m$ bits of $x$ as specified by $\phi(y_1, \ldots, y_k)$. (In earlier
terminology, the $k+1$ players will apply the OR function to Player 0's unmasked input.)

Note that regardless of what function $\phi$ is chosen, $F^\phi$ will have a small nondeterministic protocol. Player 0
simply guesses an index $j$ that is one of the indices chosen by $\phi$, and then any of the other players can
easily verify whether or not $x_j$ is 1. When $\phi$ is the bitwise AND function, then $F^\phi$ is the
Set-Disjointness function. We will show that for almost all $\phi$, the randomized communication complexity
of $F^\phi$ is large as long as $k$ is at most a constant times $\log n$. Because we will be working with a random $\phi$,
as a bonus, our argument is substantially simpler than the previous bounds obtained for Set-Disjointness.

2 Definitions and Notation 
2.1 Communication Complexity 

In the number-on-forehead (NOF) multiparty communication complexity game [5] there are $k$ players that
are trying to collaborate to compute a function $F : X_1 \times \cdots \times X_k \to \{0,1\}$ where each $X_i = \{0,1\}^n$. The $kn$
input bits are partitioned into $k$ sets, each of size $n$. For $(x_1, \ldots, x_k) \in \{0,1\}^{kn}$, and for each $i$, player $i$ knows
the values of all of the inputs except for $x_i$ (which conceptually is thought of as being placed on player $i$'s
forehead).

The players exchange bits according to an agreed-upon protocol, by writing them on a public blackboard. 
A protocol specifies, for every possible blackboard contents, whether or not the communication is over, 
the output if over and the next player to speak if not. A protocol also specifies what each player writes as 
a function of the blackboard contents and of the inputs seen by that player. The cost of a protocol is the
maximum number of bits written on the blackboard. 



In a deterministic protocol, the blackboard is initially empty. A randomized protocol of cost c is simply a 
probability distribution over deterministic protocols of cost c, which can be viewed as a protocol in which 
the players have access to a shared random string. A non-deterministic protocol is one where an initial guess 
string appears on the blackboard at the beginning of the protocol, and the players are trying to verify that 
the output of the function is 1 in the usual sense: there exists a guess string where the output of the protocol 
is 1 if and only if the output of the function is 1.

The deterministic communication complexity of $F$, written $D_k(F)$, is the minimum cost of a deterministic
protocol for $F$ that always outputs the correct answer. For $0 \le \epsilon < 1/2$, let $R_{k,\epsilon}(F)$ denote the minimum cost
of a randomized protocol for $F$ which, for every input, makes an error with probability at most $\epsilon$ (over the
choice of the deterministic protocols). The (two-sided-error) randomized communication complexity of $F$ is
$R_k(F) = R_{k,1/3}(F)$. Let $R^1_{k,\epsilon}(F)$ denote the minimum cost of a randomized protocol for $F$ which is correct
on all 0-inputs, and for every 1-input, it makes an error with probability at most $\epsilon$. The one-sided-error
randomized communication complexity of $F$ is $R^1_k(F) = R^1_{k,1/2}(F)$. The non-deterministic communication
complexity of $F$, written $N_k(F)$, is the minimum cost of a non-deterministic protocol for $F$. We usually drop
the subscript $k$ when the number of players is clear from the context.

Since any function $F_n$ on $kn$ bits can be computed using only $n$ bits of communication, following [1], for
sequences of functions $F = (F_n)_{n \in \mathbb{N}}$, protocols are considered "efficient" or "polynomial" if only
polylogarithmically many bits are exchanged. Accordingly, let $\mathrm{P}^{cc}_k$, $\mathrm{RP}^{cc}_k$, $\mathrm{BPP}^{cc}_k$ and $\mathrm{NP}^{cc}_k$ denote the classes of
function families $F$ for which $D_k(F_n)$, $R^1_k(F_n)$, $R_k(F_n)$ and $N_k(F_n)$ are $(\log n)^{O(1)}$, respectively.

Even though the standard communication complexity definitions above are given for functions with range
$\{0,1\}$, we find it more convenient to work with the range $\{-1,1\}$. We transform the former into the latter
by mapping $0 \to 1$ (representing false) and $1 \to -1$ (representing true). Thus, for example, when the range
of $F$ is $\{-1,1\}$, in a non-deterministic protocol the players are trying to verify that the output of $F$ is $-1$.

The most important method to prove lower bounds for randomized communication complexity uses the
concept of discrepancy. An $i$-cylinder $\Gamma_i$ in $X_1 \times \cdots \times X_k$ is a set such that for all $x_1 \in X_1, \ldots, x_k \in X_k$, $x'_i \in X_i$
we have $(x_1, \ldots, x_i, \ldots, x_k) \in \Gamma_i$ if and only if $(x_1, \ldots, x'_i, \ldots, x_k) \in \Gamma_i$. A cylinder intersection is a set of the
form $\Gamma = \bigcap_{i=1}^{k} \Gamma_i$ where each $\Gamma_i$ is an $i$-cylinder in $X_1 \times \cdots \times X_k$. For a set $S$, let $1_S$ be its characteristic function,
which is 1 if the input is in $S$ and 0 otherwise. Let $\lambda$ be a distribution on the inputs of $F$. The discrepancy
of $F$ on $\Gamma$ under $\lambda$ is $\mathrm{disc}^{\Gamma}_{k,\lambda}(F) = \left| \mathbf{E}_{x \sim \lambda}[F(x) 1_\Gamma(x)] \right|$. The discrepancy of $F$ under $\lambda$ is
$\mathrm{disc}_{k,\lambda}(F) = \max_\Gamma \mathrm{disc}^{\Gamma}_{k,\lambda}(F)$. The standard discrepancy method [7] connects the discrepancy of a function $F$ with its
randomized communication complexity as follows: for every distribution $\lambda$,

\[ R_{k,\epsilon}(F) \ge \log\left( \frac{1 - 2\epsilon}{\mathrm{disc}_{k,\lambda}(F)} \right). \]
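As a small illustration of these definitions, the following sketch brute-forces the discrepancy of a two-player function under the uniform distribution. With two players a cylinder intersection is simply a rectangle $A \times B$, so the maximum can be enumerated directly for tiny inputs. The helper names (`subsets`, `discrepancy_2party`, `inner_product`) are hypothetical, and the choice of inner product mod 2 as the test function is only an assumption made for the example.

```python
from math import log2

def subsets(universe):
    """Yield every subset of `universe` as a tuple."""
    for mask in range(1 << len(universe)):
        yield tuple(v for i, v in enumerate(universe) if (mask >> i) & 1)

def discrepancy_2party(F, xs, ys):
    """Brute-force discrepancy of F under the uniform distribution for k = 2.

    With two players a cylinder intersection is a rectangle A x B, so we
    maximise |E[F(x, y) * 1_{A x B}(x, y)]| over all pairs (A, B).
    """
    total = len(xs) * len(ys)
    best = 0.0
    for A in subsets(xs):
        for B in subsets(ys):
            s = sum(F(x, y) for x in A for y in B)
            best = max(best, abs(s) / total)
    return best

def inner_product(x, y):
    """Inner product mod 2 of two bit strings (as integers), range {-1, +1}."""
    return -1 if bin(x & y).count("1") % 2 else 1

xs = ys = list(range(4))                    # all 2-bit strings, as integers
disc = discrepancy_2party(inner_product, xs, ys)
eps = 1 / 3
lower_bound = log2((1 - 2 * eps) / disc)    # value of log((1 - 2 eps) / disc)
print(disc, lower_bound)
```

On inputs this small the resulting bound is not informative; the point is only to make the quantities in the inequality concrete.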



2.2 Notation 

Throughout this paper, the functions whose communication complexity we are analyzing are denoted by
capital letters such as $F$. As mentioned in the introduction, we will be restricting our attention to certain
functions which are constructed from a base function, usually denoted by lower case $f$, and a masking
function, usually denoted by $\phi$. In general, $m$ denotes the size of the input to the base function $f$, and the
range of this function is $\{-1,1\}$. A specific base function we will work with is the OR function, which
takes on the value $-1$ if and only if any of its input bits is 1. The masking function $\phi$ takes as input $k$ strings
of $n$ bits each, usually denoted by $y_1, \ldots, y_k$, and its output is an $m$-element subset of $[1,n]$. We always have
$m < n$. Starting with a base function $f$ and a masking function $\phi$, we construct a function $\mathrm{Lift}(f, \phi)$ on
$(k+1)n$ input bits as follows. Given $n$-bit inputs $x, y_1, \ldots, y_k$, $\phi$ is evaluated on the latter $k$ inputs to select
a set of $m$ bits in $x$ on which we apply $f$. Formally, $\mathrm{Lift}(f, \phi)(x, y_1, \ldots, y_k) = f(x|\phi(y_1, \ldots, y_k))$, where
for a set $S \subseteq [1,n]$, $x|S$ denotes the substring of $x$ indexed by the elements in $S$. We are interested in the
communication complexity of $\mathrm{Lift}(f, \phi)$ in the NOF model with $k+1$ players, where player 0 gets $x$ and
players 1 through $k$ get $y_1$ through $y_k$, respectively.
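The lifting construction can be sketched in a few lines. This is a minimal illustration only: `lift`, `restrict` and in particular `toy_mask` are hypothetical names, and `toy_mask` is an arbitrary masking function chosen for $n = 4$, $k = 2$, $m = 2$, not one used in the paper.

```python
from typing import FrozenSet, Sequence, Tuple

Bits = Tuple[int, ...]

def OR(bits: Sequence[int]) -> int:
    """OR with range {-1, +1}: -1 (true) iff some input bit is 1."""
    return -1 if any(bits) else 1

def restrict(x: Bits, S: FrozenSet[int]) -> Bits:
    """x|S: the substring of x indexed by S (indices are 1-based)."""
    return tuple(x[i - 1] for i in sorted(S))

def lift(f, phi):
    """Return Lift(f, phi): (x, y_1, ..., y_k) -> f(x | phi(y_1, ..., y_k))."""
    def lifted(x: Bits, *ys: Bits) -> int:
        return f(restrict(x, phi(*ys)))
    return lifted

def toy_mask(y1: Bits, y2: Bits) -> FrozenSet[int]:
    """Hypothetical masking function for n = 4, k = 2, m = 2."""
    start = (2 * y1[0] + y2[0]) % 3 + 1      # a value in {1, 2, 3}
    return frozenset({start, start + 1})

F = lift(OR, toy_mask)
x = (0, 0, 1, 0)
zeros = (0, 0, 0, 0)
print(F(x, zeros, zeros))            # mask selects positions {1, 2}
print(F(x, (1, 0, 0, 0), zeros))     # mask selects positions {3, 4}
```

Only the mask depends on $y_1, \ldots, y_k$; the base function never sees them directly.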

2.3 Correlation, Fourier Representation and Degree 

Let $f, g : \{0,1\}^m \to \mathbb{R}$. Let $\mu$ be a distribution on the set $\{0,1\}^m$. We define the correlation between $f$ and
$g$ under $\mu$ to be $\mathrm{corr}_\mu(f,g) = \mathbf{E}_{x \sim \mu}[f(x)g(x)]$. Whenever we omit to mention a specific distribution when
computing the correlation, an expected value or a probability, it is to be assumed that we are talking about
the uniform distribution.

For $S \subseteq [1,m]$, let $\chi_S(x) = (-1)^{\sum_{i \in S} x_i}$ be the Fourier character of the set $S$. Let $f : \{0,1\}^m \to \mathbb{R}$ and let
$\hat{f}_S = \mathrm{corr}(f, \chi_S)$. Then $f(x) = \sum_{S \subseteq [1,m]} \hat{f}_S \chi_S(x)$ is the Fourier representation of $f$. The exact degree of $f$ is
the size of the largest $S$ such that $\hat{f}_S$ is non-zero. The $\epsilon$-approximate degree of $f$, denoted by $\deg_\epsilon(f)$, is the
smallest $d$ for which there exists a function $g$ of exact degree $d$ such that $\max_x |f(x) - g(x)| \le \epsilon$.
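To make the Fourier representation concrete, the sketch below computes all Fourier coefficients of OR on $m = 3$ bits by direct enumeration and reads off the exact degree. The function names are hypothetical, and sets $S$ are encoded as 0-based tuples purely for convenience.

```python
from itertools import combinations, product

def chi(S, x):
    """Fourier character chi_S(x) = (-1)^(sum of x_i over i in S); S is 0-based."""
    return -1 if sum(x[i] for i in S) % 2 else 1

def fourier_coefficients(f, m):
    """Map each S (a sorted tuple) to corr(f, chi_S) under the uniform distribution."""
    points = list(product((0, 1), repeat=m))
    coeffs = {}
    for size in range(m + 1):
        for S in combinations(range(m), size):
            coeffs[S] = sum(f(x) * chi(S, x) for x in points) / len(points)
    return coeffs

def exact_degree(coeffs, tol=1e-12):
    """Size of the largest S whose coefficient is non-zero."""
    return max((len(S) for S, c in coeffs.items() if abs(c) > tol), default=0)

def OR(x):
    """OR with range {-1, +1}: -1 iff some bit is 1."""
    return -1 if any(x) else 1

m = 3
coeffs = fourier_coefficients(OR, m)

def reconstruct(x):
    """Evaluate the Fourier representation sum_S coeff_S * chi_S(x)."""
    return sum(c * chi(S, x) for S, c in coeffs.items())

print(coeffs[()], exact_degree(coeffs))
```

The approximate degree is a different and harder quantity to compute; this sketch only covers the exact representation.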

2.4 Set Families 

Let $\mathcal{S} = (S_1, \ldots, S_t)$ be a multi-set of $m$-element subsets of $[1,n]$. Let the range of $\mathcal{S}$, denoted by $\bigcup\mathcal{S}$, be the
set of indices from $[1,n]$ that appear in at least one set in $\mathcal{S}$. Let the boundary of $\mathcal{S}$, denoted by $\partial\mathcal{S}$, be the set
of indices from $[1,n]$ that appear in exactly one set in the collection $\mathcal{S}$.
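A minimal sketch of these two notions, assuming a multi-set is represented as a Python list of sets (so repeated sets are allowed). The function names are hypothetical.

```python
from collections import Counter

def multiplicities(family):
    """Count, for every index, how many sets of the multi-set contain it."""
    return Counter(i for S in family for i in S)

def range_of(family):
    """Indices that appear in at least one set of the family."""
    return set(multiplicities(family))

def boundary_of(family):
    """Indices that appear in exactly one set of the family."""
    return {i for i, c in multiplicities(family).items() if c == 1}

# A multi-set of 3-element subsets of [1, 5]; the first set occurs twice.
family = [{1, 2, 3}, {3, 4, 5}, {1, 2, 3}]
print(range_of(family), boundary_of(family))
```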

3 Statement of Results 

Our main technical result is the following. 

Theorem 3.1. Let $\delta < 1$ be a constant. Let $\epsilon = (1-\delta)/4$. Let $m = n^\epsilon$ and let $k \le \delta \cdot \log n$. There exists a
function $\phi$ such that $R_{k+1}(\mathrm{Lift}(\mathrm{OR}, \phi)) \ge n^{\Omega(1)}$.

Proof of Main Theorem 1.1 from Theorem 3.1. Consider the function $\phi$ whose existence is guaranteed by
Theorem 3.1. On the one hand, the Theorem implies that $\mathrm{Lift}(\mathrm{OR}, \phi) \notin \mathrm{BPP}^{cc}_{k+1}$.

On the other hand, the following is a nondeterministic protocol for $\mathrm{Lift}(\mathrm{OR}, \phi)$: guess an index $i \in [1,n]$
using $\log n$ bits; player 0 (the one holding $x$ on its forehead) locally computes $\phi(y_1, \ldots, y_k)$ and
communicates a 1 if $i$ belongs to that set; player 1 communicates a 1 if $x_i = 1$. The cost of this protocol is $O(\log n)$.
Easily, $\mathrm{Lift}(\mathrm{OR}, \phi)(x, y_1, \ldots, y_k) = -1$ iff there exists a guess $i$ such that both players communicate a 1.
Thus, $\mathrm{Lift}(\mathrm{OR}, \phi) \in \mathrm{NP}^{cc}_{k+1}$. □
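The correctness claim for this protocol can be checked exhaustively on a toy instance. The sketch below assumes $n = 3$, $k = 2$, $m = 2$ and an arbitrary hypothetical masking function `toy_mask`; it simulates the two verification messages for every guess and compares the outcome with a direct evaluation of $\mathrm{Lift}(\mathrm{OR}, \phi)$.

```python
from itertools import product

N = 3   # toy size: n = 3 bits per input, masks of size m = 2

def toy_mask(y1, y2):
    """Hypothetical masking function: k = 2 strings -> 2-element subset of [1, 3]."""
    first = (sum(y1) + 2 * sum(y2)) % N + 1
    return frozenset({first, first % N + 1})

def lift_or(x, ys, phi):
    """Lift(OR, phi)(x, y_1, ..., y_k), range {-1, +1} with -1 meaning true."""
    return -1 if any(x[i - 1] for i in phi(*ys)) else 1

def both_say_one(guess, x, ys, phi):
    """Simulate the two verification messages for one guessed index."""
    player0_bit = int(guess in phi(*ys))   # player 0 sees every y_j
    player1_bit = x[guess - 1]             # player 1 sees x
    return player0_bit == 1 and player1_bit == 1

def nondeterministic_output(x, ys, phi, n=N):
    """-1 iff some guess in [1, n] makes both players write a 1."""
    return -1 if any(both_say_one(i, x, ys, phi) for i in range(1, n + 1)) else 1

cube = list(product((0, 1), repeat=N))
agree = all(
    nondeterministic_output(x, (y1, y2), toy_mask) == lift_or(x, (y1, y2), toy_mask)
    for x in cube for y1 in cube for y2 in cube
)
print(agree)
```

The check does not depend on the particular mask: any function returning a subset of $[1,n]$ could be substituted for `toy_mask`.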
4 Proof of Main Result



We obtain our lower bounds on the bounded-error communication complexity of $\mathrm{Lift}(\mathrm{OR}, \phi)$ using an
analysis that follows [7]. In their paper, Chattopadhyay and Ada analyze the Set-Disjointness function, and for
that reason, their masking function $\phi$ must be the AND function. In our case, intuitively, we allow $\phi$ to
be a random function. While our results no longer apply to Set-Disjointness, we still obtain a separation
between $\mathrm{BPP}^{cc}_k$ and $\mathrm{NP}^{cc}_k$ because, no matter what masking function is used, $\mathrm{Lift}(\mathrm{OR}, \phi)$ always has a cheap
nondeterministic protocol.

At a more technical level, the results of [7] become trivial when $k > \log \log n$ because of the relationship
between $n$ (the size of the input to $F$) and $m$ (the number of bits the base function OR gets applied to). For
their analysis to go through, they need $n = 2^{2^k} m^{O(1)}$. In our case, $n = 2^k m^{O(1)}$ is sufficient, and this allows our
results to be non-trivial for $k \le \delta \log n$ for any $\delta < 1$.

4.1 Overview of Proof 

As mentioned earlier, we will start with the base function $f = \mathrm{OR}$ on $m$ input bits, $m < n$. We lift the base
function $f$ in order to obtain the lifted function $F^\phi = \mathrm{Lift}(f, \phi)$. Recall that $F^\phi$ is a function on $(k+1)n$
inputs with small nondeterministic complexity, and is obtained by applying the base function (in this case
the OR function) to the unmasked bits of Player 0's input, $x$. We want to prove that for a random $\phi$, $F^\phi$ has
high randomized communication complexity.

Paturi [12] proved that no function that is a sum of low-degree Fourier characters can well-approximate the
OR function. This implies that there exists a function $g$ (also on $m$ bits) and a distribution $\mu$ over all $m$-bit
inputs such that the functions $g$ and $f = \mathrm{OR}$ are highly correlated over $\mu$ and furthermore, $g$ is orthogonal to
all small Fourier characters. This is our Lemma 4.1 and it was originally proved using duality by Sherstov
[15] in the context of 2-player lower bounds for quantum communication complexity.

Now we lift the function $g$ in order to get the function $G^\phi = \mathrm{Lift}(g, \phi)$. Define $\lambda$ to be a distribution over all
$(k+1)n$-bit inputs that is the natural extension of $\mu$. Since $g$ and $f = \mathrm{OR}$ are highly correlated over $\mu$, it is
not hard to see (using the definitions and the fact that $\lambda$ is the natural extension of $\mu$ to the lifted space) that
the lifted versions, $F^\phi$ and $G^\phi$, are also highly correlated over $\lambda$.

By the generalized discrepancy method (Lemma 4.2), in order to prove that the randomized complexity of
$F^\phi$ is high, it suffices to prove that $G^\phi$ has small discrepancy. This final step is accomplished by Lemmas 4.4,
4.5 and 4.6 using two important properties of $g$ and $\phi$. The crucial property of $g$ that we exploit is that it
is orthogonal to the space of all small Fourier characters. This property will be used to prove Lemma 4.4.
Secondly, we want $\phi$ to behave like a random function with respect to all sub-cubes. This second property
is exploited in order to prove Lemma 4.6. We now proceed with the formal proof.

4.2 Proof of Main Theorem 

The following lemma is from [15]. Intuitively it shows the following. Let $f$ be a base function on $m$
bits, with the property that no function in the low-degree Fourier subspace can approximate $f$. (We
will be interested in $f = \mathrm{OR}$.) The lemma states that this implies the existence of another function $g$ and
a distribution $\mu$ such that $g$ is in the orthogonal subspace of low-degree Fourier characters and $g$
well-approximates $f$.

Lemma 4.1 (Orthogonality Lemma, Lemma 5.1 in [7]). If $f : \{0,1\}^m \to \{-1,1\}$ is a function with
$\delta'$-approximate degree $d$, there exist a function $g : \{0,1\}^m \to \{-1,1\}$ and a distribution $\mu$ on $\{0,1\}^m$ such
that:

(i) $\mathrm{corr}_\mu(g, f) > \delta'$; and

(ii) for every $T \subseteq [1,m]$ with $|T| < d$ and every function $h : \{0,1\}^{|T|} \to \mathbb{R}$, $\mathbf{E}_{x \sim \mu}[g(x) \cdot h(x|T)] = 0$.

The next lemma is the generalized discrepancy lemma from [7]. It states that if two functions $F$ and $G$
are highly correlated, and if $G$ has small discrepancy (and hence high communication complexity), then the
communication complexity of $F$ is also high.

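Lemma 4.2 (Generalized Discrepancy Lemma, Lemma 3.2 in [7]). Let $Z = Z_1 \times \cdots \times Z_k$. Let $F, G : Z \to
\{-1,1\}$ and let $\lambda$ be a distribution on $Z$ such that $\mathrm{corr}_\lambda(G, F) > \delta'$. Then, for every $\epsilon' < \delta'/2$,

\[ R_{k,\epsilon'}(F) \ge \log\left( \frac{\delta' - 2\epsilon'}{\mathrm{disc}_{k,\lambda}(G)} \right). \]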

The following lemma is standard and used in every discrepancy argument. See [2, 13, 8] for details.

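Lemma 4.3 (The standard BNS argument). Let $Z = X \times Y_1 \times \cdots \times Y_k$ and let $F : Z \to \{-1,1\}$. Let $\Gamma \subseteq Z$
be a cylinder intersection. We write $y$ for $(y_1, \ldots, y_k)$. Then,

\[ \left| \mathbf{E}_{x,y}\left[ F(x,y)\, 1_\Gamma(x,y) \right] \right|^{2^k} \le \mathbf{E}_{y^0,y^1}\left[ \left| \mathbf{E}_x\left[ \prod_{u \in \{0,1\}^k} F(x, y_1^{u_1}, \ldots, y_k^{u_k}) \right] \right| \right]. \]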
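For the smallest case $Z = X \times Y_1$ (so $2^k = 2$), the inequality can be checked numerically. The sketch below is only a sanity check under stated assumptions: $F$ is a random $\pm 1$ matrix, the cylinder intersection is a rectangle whose indicator factors as `a[x] * b[y]`, all expectations are uniform, and every name in the code is hypothetical.

```python
import random

def bns_two_sides(F, a, b):
    """Both sides of the BNS inequality for Z = X x Y_1 (so 2^k = 2).

    F is a matrix of +-1 entries indexed [x][y]; a depends only on x and
    b only on y, so the rectangle indicator is 1_Gamma(x, y) = a[x] * b[y].
    """
    nx, ny = len(F), len(F[0])
    inner = sum(F[x][y] * a[x] * b[y] for x in range(nx) for y in range(ny))
    lhs = abs(inner / (nx * ny)) ** 2
    rhs = sum(
        abs(sum(F[x][y0] * F[x][y1] for x in range(nx)) / nx)
        for y0 in range(ny) for y1 in range(ny)
    ) / ny ** 2
    return lhs, rhs

rng = random.Random(1)
results = []
for _ in range(200):
    F = [[rng.choice((-1, 1)) for _ in range(4)] for _ in range(4)]
    a = [rng.randint(0, 1) for _ in range(4)]
    b = [rng.randint(0, 1) for _ in range(4)]
    results.append(bns_two_sides(F, a, b))

print(all(lhs <= rhs + 1e-12 for lhs, rhs in results))
```

In this two-party case the inequality is a single application of Cauchy-Schwarz; for larger $k$ the same step is repeated once per $Y_i$, which is where the exponent $2^k$ comes from.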



Using the above lemmas, we will now prove Theorem 3.1. By [12], $\deg_{5/6}(\mathrm{OR}) > c\sqrt{m}$ for some constant
$c$. By Lemma 4.1 applied with $f = \mathrm{OR}$, there exist a function $g$ and a distribution $\mu$ such that:

(i) $\mathrm{corr}_\mu(g, \mathrm{OR}) > 5/6$; and

(ii) for every $T \subseteq [1,m]$ with $|T| < c\sqrt{m}$ and every function $h : \{0,1\}^{|T|} \to \mathbb{R}$, $\mathbf{E}_{x \sim \mu}[g(x) h(x|T)] = 0$.

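For every masking function $\phi$, let $F^\phi = \mathrm{Lift}(\mathrm{OR}, \phi)$ and let $G^\phi = \mathrm{Lift}(g, \phi)$. As in [7], we define the
distribution $\lambda$ on $\{0,1\}^{(k+1)n}$ as follows. For $x \in \{0,1\}^n$ and $y = (y_1, \ldots, y_k) \in \{0,1\}^{kn}$, let

\[ \lambda(x, y) = \frac{\mu(x|\phi(y))}{2^{(k+1)n - m}}. \]

It can be easily verified that $\mathrm{corr}_\lambda(G^\phi, F^\phi) = \mathrm{corr}_\mu(g, \mathrm{OR}) > 5/6$. Thus, by Lemma 4.2,

\[ R_{k+1}(F^\phi) \ge \log\left( \frac{5/6 - 2/3}{\mathrm{disc}_\lambda(G^\phi)} \right) = \log\left( \frac{1}{\mathrm{disc}_\lambda(G^\phi)} \right) - O(1). \]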
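The two facts used here, that $\lambda$ is a probability distribution and that lifting preserves correlation, can be checked by enumeration on a toy instance. The sketch below assumes $n = 3$, $m = 2$, $k = 1$, a randomly chosen $\mu$, $g$ and $\phi$, and 0-based index sets; all of these choices and all names are hypothetical and serve only as an illustration.

```python
import random
from itertools import combinations, product

n, m, k = 3, 2, 1
rng = random.Random(0)

cube_m = list(product((0, 1), repeat=m))
cube_n = list(product((0, 1), repeat=n))

# A random distribution mu on {0,1}^m and a random +-1 function g.
weights = [rng.random() for _ in cube_m]
mu = {z: w / sum(weights) for z, w in zip(cube_m, weights)}
g = {z: rng.choice((-1, 1)) for z in cube_m}
OR = {z: (-1 if any(z) else 1) for z in cube_m}

# A random masking function: each y (k = 1 string) gets an m-subset of positions.
m_subsets = list(combinations(range(n), m))
phi = {y: rng.choice(m_subsets) for y in cube_n}

def restrict(x, S):
    """x|S for a sorted tuple S of 0-based positions."""
    return tuple(x[i] for i in S)

def lam(x, y):
    """lambda(x, y) = mu(x|phi(y)) / 2^((k+1)n - m)."""
    return mu[restrict(x, phi[y])] / 2 ** ((k + 1) * n - m)

total_mass = sum(lam(x, y) for x in cube_n for y in cube_n)
corr_lifted = sum(
    lam(x, y) * g[restrict(x, phi[y])] * OR[restrict(x, phi[y])]
    for x in cube_n for y in cube_n
)
corr_base = sum(mu[z] * g[z] * OR[z] for z in cube_m)
print(total_mass, corr_lifted, corr_base)
```

The normalizing factor $2^{(k+1)n - m}$ counts the $2^{kn}$ choices of $y$ times the $2^{n-m}$ settings of the bits of $x$ outside the mask.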

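Let $\Gamma$ be the cylinder intersection that witnesses the discrepancy of $G^\phi$ under $\lambda$. Then,

\[ \mathrm{disc}_\lambda(G^\phi) = \mathrm{disc}^{\Gamma}_\lambda(G^\phi) = \left| \mathbf{E}_{(x,y) \sim \lambda}\left[ G^\phi(x,y)\, 1_\Gamma(x,y) \right] \right| = 2^m \left| \mathbf{E}_{x,y}\left[ \mu(x|\phi(y))\, g(x|\phi(y))\, 1_\Gamma(x,y) \right] \right| \]

where the last equality follows from the connection between $\lambda$ and the uniform distribution. Finally, by
Lemma 4.3 we obtain

\[ \forall \phi, \quad \left( \mathrm{disc}_\lambda(G^\phi) \right)^{2^k} \le 2^{m 2^k}\, \mathbf{E}_{y^0,y^1}\left[ \left| \mathbf{E}_x\left[ \prod_{u \in \{0,1\}^k} \mu(x|\phi(y_1^{u_1}, \ldots, y_k^{u_k}))\, g(x|\phi(y_1^{u_1}, \ldots, y_k^{u_k})) \right] \right| \right]. \]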



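It is at this point that we diverge from the analysis in [7]. Let $A = A(y^0, y^1)$ be the event "$\exists i$ such that
$y_i^0 = y_i^1$". Clearly, this event depends only on the choice of $y^0$ and $y^1$. By a simple union bound, $\Pr_{y^0,y^1}[A] \le
k/2^n = 2^{-n + \log k}$. Furthermore, $\Pr_{y^0,y^1}[\neg A] \le 1$, and since $|\mu g| \le 1$, $\mathbf{E}_{y^0,y^1}[\ldots \mid A] \le 1$. Thus,

\[ \forall \phi, \quad \left( \mathrm{disc}_\lambda(G^\phi) \right)^{2^k} \le 2^{-n + m 2^k + \log k} + 2^{m 2^k}\, \mathbf{E}_{y^0,y^1}\left[ \left| \mathbf{E}_x\left[ \prod_{u \in \{0,1\}^k} \mu(x|\phi(y_1^{u_1}, \ldots, y_k^{u_k}))\, g(x|\phi(y_1^{u_1}, \ldots, y_k^{u_k})) \right] \right| \,\middle|\, \neg A \right]. \]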



For the remaining part of the analysis, we fix the choices of $y^0$ and $y^1$ in such a way that the event $A$ does
not occur. For $u \in \{0,1\}^k$, define $S_u = S_u(y^0, y^1, \phi) = \phi(y_1^{u_1}, \ldots, y_k^{u_k})$. Let $\mathcal{S} = \mathcal{S}(y^0, y^1, \phi)$ be the multi-set
$(S_u : u \in \{0,1\}^k)$. Even though the sets $S_u$ and the multi-set $\mathcal{S}$ depend on $y^0$, $y^1$ and $\phi$, we will usually omit
explicitly indicating this dependence in our proofs in order to reduce the clutter. We define the number of
conflicts in $\mathcal{S}$ to be $q(\mathcal{S}) = m 2^k - |\bigcup\mathcal{S}|$. Intuitively, $|\bigcup\mathcal{S}|$ measures the range of $\mathcal{S}$, while $m 2^k$ is the maximum
possible value for this range.
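The construction of the multi-set $(S_u : u \in \{0,1\}^k)$ and its conflict count can be sketched as follows. The masking function here is a lazily sampled random function, which is an assumption made for the illustration; `random_mask`, `build_family` and `conflicts` are hypothetical names, and the sizes $n = 16$, $m = 3$, $k = 2$ are arbitrary.

```python
import random
from itertools import product

def random_mask(n, m, rng):
    """A lazily sampled random masking function: each new input tuple
    (y_1, ..., y_k) is assigned a uniformly random m-subset of [1, n]."""
    table = {}
    def phi(*ys):
        if ys not in table:
            table[ys] = frozenset(rng.sample(range(1, n + 1), m))
        return table[ys]
    return phi

def build_family(y0, y1, phi):
    """Return the list (S_u : u in {0,1}^k), S_u = phi(y_1^{u_1}, ..., y_k^{u_k})."""
    k = len(y0)
    choices = (y0, y1)
    return [
        phi(*(choices[u[i]][i] for i in range(k)))
        for u in product((0, 1), repeat=k)
    ]

def conflicts(family, m):
    """q(S) = m * (number of sets) - |union of the sets|."""
    return m * len(family) - len(set().union(*family))

n, m, k = 16, 3, 2
rng = random.Random(7)
phi = random_mask(n, m, rng)
y0 = tuple((0,) * n for _ in range(k))   # y_i^0 = all zeros
y1 = tuple((1,) * n for _ in range(k))   # y_i^1 = all ones, so the event A does not occur
family = build_family(y0, y1, phi)
q = conflicts(family, m)
print(len(family), q)
```

Because $y_i^0 \ne y_i^1$ for every $i$, the $2^k$ evaluation points are distinct and the $2^k$ sets are sampled independently.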

We use the following three Lemmas to complete our proof. 

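Lemma 4.4. For every $y^0, y^1$ and $\phi$, if $\neg A(y^0, y^1)$ and $q(\mathcal{S}(y^0, y^1, \phi)) < c \cdot \sqrt{m} \cdot 2^k/2$, then

\[ \mathbf{E}_x\left[ \prod_{u \in \{0,1\}^k} \mu(x|S_u(y^0, y^1, \phi))\, g(x|S_u(y^0, y^1, \phi)) \right] = 0. \]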



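Lemma 4.5. For every $y^0, y^1$ and $\phi$, if $\neg A(y^0, y^1)$,

\[ \mathbf{E}_x\left[ \prod_{u \in \{0,1\}^k} \mu(x|S_u(y^0, y^1, \phi)) \right] \le \frac{2^{q(\mathcal{S}(y^0, y^1, \phi))}}{2^{m 2^k}}. \]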



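Lemma 4.6. For every $y^0, y^1$, if $\neg A(y^0, y^1)$, when $\phi$ is chosen at random,

\[ \Pr_\phi\left[ q(\mathcal{S}(y^0, y^1, \phi)) = q \mid \neg A(y^0, y^1) \right] \le \left( \frac{m \cdot 2^k}{n} \right)^q. \]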



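Before proving these Lemmas, we complete the proof of our main Theorem. Since the bound on $\mathrm{disc}_\lambda(G^\phi)$
holds for every $\phi$, we can write

\[ \mathbf{E}_\phi\left[ \left( \mathrm{disc}_\lambda(G^\phi) \right)^{2^k} \right] \le 2^{-n + m 2^k + \log k} + 2^{m 2^k}\, \mathbf{E}_{y^0,y^1,\phi}\left[ \left| \mathbf{E}_x\left[ \prod_{u \in \{0,1\}^k} \mu(x|S_u)\, g(x|S_u) \right] \right| \,\middle|\, \neg A \right]. \]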
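Moreover,

\begin{align*}
\mathbf{E}_{y^0,y^1,\phi}\left[ \left| \mathbf{E}_x\left[ \prod_{u \in \{0,1\}^k} \mu(x|S_u)\, g(x|S_u) \right] \right| \,\middle|\, \neg A \right]
 &\le \sum_{q \ge 0} \Pr[q(\mathcal{S}) = q \mid \neg A]\; \mathbf{E}_{y^0,y^1,\phi}\left[ \left| \mathbf{E}_x\left[ \prod_{u \in \{0,1\}^k} \mu(x|S_u)\, g(x|S_u) \right] \right| \,\middle|\, \neg A,\ q(\mathcal{S}) = q \right] \\
\text{(by Lemma 4.4)} \quad
 &\le \sum_{q \ge c\sqrt{m}\,2^k/2} \Pr[q(\mathcal{S}) = q \mid \neg A]\; \mathbf{E}_{y^0,y^1,\phi}\left[ \left| \mathbf{E}_x\left[ \prod_{u \in \{0,1\}^k} \mu(x|S_u)\, g(x|S_u) \right] \right| \,\middle|\, \neg A,\ q(\mathcal{S}) = q \right] \\
\text{(because $|g| = 1$)} \quad
 &\le \sum_{q \ge c\sqrt{m}\,2^k/2} \Pr[q(\mathcal{S}) = q \mid \neg A]\; \mathbf{E}_{y^0,y^1,\phi}\left[ \mathbf{E}_x\left[ \prod_{u \in \{0,1\}^k} \mu(x|S_u) \right] \,\middle|\, \neg A,\ q(\mathcal{S}) = q \right] \\
\text{(by Lemma 4.5)} \quad
 &\le \sum_{q \ge c\sqrt{m}\,2^k/2} \Pr[q(\mathcal{S}) = q \mid \neg A] \cdot \frac{2^q}{2^{m 2^k}} \\
\text{(by Lemma 4.6)} \quad
 &\le \sum_{q \ge c\sqrt{m}\,2^k/2} \left( \frac{m 2^k}{n} \right)^q \frac{2^q}{2^{m 2^k}}
 = \frac{1}{2^{m 2^k}} \sum_{q \ge c\sqrt{m}\,2^k/2} \left( \frac{2 m 2^k}{n} \right)^q.
\end{align*}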



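We have chosen $\epsilon = (1-\delta)/4$, so $1 - \epsilon - \delta = 3\epsilon$. Furthermore, $m = n^\epsilon$ and $k \le \delta \log n$, so $m 2^k/n \le
n^{-1+\epsilon+\delta} = n^{-3\epsilon} < 1/4$ when $n$ is large enough. Thus, $2 m 2^k/n < 1/2$. Using $\sum_{q \ge q_0} w^q = w^{q_0}/(1-w) < 2 w^{q_0}$
for $w < 1/2$, we obtain

\[ \mathbf{E}_{y^0,y^1,\phi}\left[ \left| \mathbf{E}_x\left[ \prod_{u \in \{0,1\}^k} \mu(x|S_u)\, g(x|S_u) \right] \right| \,\middle|\, \neg A \right] \le \frac{2^{1 - c\sqrt{m}\,2^k/2}}{2^{m 2^k}}. \]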
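The two arithmetic facts used in this step can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, the geometric tail is truncated after a fixed number of terms, and the concrete values $n = 2^{40}$, $\delta = 1/2$ are arbitrary choices.

```python
from math import isclose, log2

def tail_sum(w, q0, terms=200):
    """Truncated geometric tail: sum of w^q for q = q0, ..., q0 + terms - 1."""
    return sum(w ** q for q in range(q0, q0 + terms))

def mask_ratio(n, delta):
    """Return (m * 2^k / n, n^(-3 eps)) for eps = (1 - delta) / 4,
    m = n^eps and k = delta * log2(n), i.e. k at its largest allowed value."""
    eps = (1 - delta) / 4
    m = n ** eps
    k = delta * log2(n)
    return m * 2 ** k / n, n ** (-3 * eps)

w, q0 = 0.4, 5
tail = tail_sum(w, q0)
closed_form = w ** q0 / (1 - w)

ratio, bound = mask_ratio(2 ** 40, 0.5)
print(tail, closed_form, 2 * w ** q0)
print(ratio, bound)
```

With $k$ at its maximum value the ratio $m 2^k / n$ equals $n^{-3\epsilon}$ exactly, which is why the choice $\epsilon = (1-\delta)/4$ leaves room for the factor of 2 in $2 m 2^k / n < 1/2$.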



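Putting everything together,

\[ \mathbf{E}_\phi\left[ \left( \mathrm{disc}_\lambda(G^\phi) \right)^{2^k} \right] \le 2^{-n + m 2^k + \log k} + 2^{m 2^k}\, 2^{-m 2^k}\, 2^{1 - c\sqrt{m}\,2^k/2}. \]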



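For the exponent of the first term, note that $\log k < m 2^k$ and $n > 4 m 2^k$, so $-n + m 2^k + \log k < -2 m 2^k$. When
$m$ is large enough, $-2 m 2^k < -c\sqrt{m}\,2^k/4$. For the exponent of the second term, note that $1 < c\sqrt{m}\,2^k/4$ when
$m$ is large enough, so $1 - c\sqrt{m}\,2^k/2 < -c\sqrt{m}\,2^k/4$. Thus, the sum of the two terms is at most $2^{1 - c\sqrt{m}\,2^k/4}$,
which is at most $2^{-c\sqrt{m}\,2^k/8}$ when $m$ is large enough, so

\[ \mathbf{E}_\phi\left[ \left( \mathrm{disc}_\lambda(G^\phi) \right)^{2^k} \right] \le 2^{-c\sqrt{m}\,2^k/8}. \]

Therefore, there exists some $\phi$ such that $\mathrm{disc}_\lambda(G^\phi) \le 2^{-c\sqrt{m}/8}$. For this $\phi$,

\[ R_{k+1}(F^\phi) \ge \log\left( \frac{1}{\mathrm{disc}_\lambda(G^\phi)} \right) - O(1) \ge \Omega(1)\sqrt{m} = \Omega(1)\, n^{\epsilon/2} \ge n^{\Omega(1)}. \]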
5 Proofs of Lemmas



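Proof of Lemma 4.4. We write $S_u$ for $S_u(y^0, y^1, \phi)$ and $\mathcal{S}$ for $\mathcal{S}(y^0, y^1, \phi)$. Assume $q(\mathcal{S}) < c\sqrt{m}\,2^k/2$. Let
$r(\mathcal{S}) = |\bigcup\mathcal{S}|$ be the size of the range of $\mathcal{S}$, and let $b(\mathcal{S}) = |\partial\mathcal{S}|$ be the size of the boundary of $\mathcal{S}$. Note that
$r(\mathcal{S}) - b(\mathcal{S}) \le q(\mathcal{S})$ because every $j \in \bigcup\mathcal{S} \setminus \partial\mathcal{S}$ occurs in at least 2 sets in $\mathcal{S}$, thus contributes at least 1 to
$q(\mathcal{S})$. Furthermore, $r(\mathcal{S}) + q(\mathcal{S}) = m 2^k$. Then, $b(\mathcal{S}) \ge r(\mathcal{S}) - q(\mathcal{S}) = m 2^k - 2 q(\mathcal{S}) > (m - c\sqrt{m}) 2^k$. There are
$2^k$ sets in the multi-set $\mathcal{S}$ so by the pigeonhole principle, there exists $v$ such that $|S_v \cap \partial\mathcal{S}| > m - c\sqrt{m}$. We
can write

\[ \mathbf{E}_x\left[ \prod_{u \in \{0,1\}^k} \mu(x|S_u)\, g(x|S_u) \right] = \mathbf{E}_{x|S_v}\left[ \mu(x|S_v)\, g(x|S_v) \cdot \mathbf{E}_{x|[1,n] \setminus S_v}\left[ \prod_{u \in \{0,1\}^k,\, u \ne v} \mu(x|S_u)\, g(x|S_u) \right] \right]. \]

Let $T = S_v \setminus \partial\mathcal{S}$. So $|T| < c\sqrt{m}$. Let $h = \mathbf{E}_{x|[1,n] \setminus S_v}\left[ \prod_{u \ne v} \mu(x|S_u)\, g(x|S_u) \right]$. Note that $h$ is a function that
depends only on $x|T$. Then, by the property (ii) of $g$ and $\mu$, $\mathbf{E}_{x|S_v}\left[ \mu(x|S_v)\, g(x|S_v)\, h(x|T) \right] = 0$. □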

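Proof of Lemma 4.5. We write $S_u$ for $S_u(y^0, y^1, \phi)$ and $\mathcal{S}$ for $\mathcal{S}(y^0, y^1, \phi)$. We see that

\[ \mathbf{E}_x\left[ \prod_{u \in \{0,1\}^k} \mu(x|S_u) \right] = \mathbf{E}_{x|[1,n] \setminus \bigcup\mathcal{S}}\left[ \mathbf{E}_{x|\bigcup\mathcal{S}}\left[ \prod_{u \in \{0,1\}^k} \mu(x|S_u) \right] \right] = \mathbf{E}_{x|\bigcup\mathcal{S}}\left[ \prod_{u \in \{0,1\}^k} \mu(x|S_u) \right]. \]

Every $u \in \{0,1\}^k$ can be interpreted as an integer in the range $[0, 2^k - 1]$. With this in mind, for $0 \le j \le
2^k - 1$, let $\mathcal{S}_j$ be the sub-multi-set of $\mathcal{S}$ consisting of the sets up to and including $S_j$, $\mathcal{S}_j = (S_0, \ldots, S_j)$. So,
$\mathcal{S} = \mathcal{S}_{2^k - 1}$. Define $\mathcal{S}_{-1} = \emptyset$. For $0 \le j \le 2^k - 1$, let $G_j = \mathbf{E}_{x|\bigcup\mathcal{S}_j}\left[ \prod_{i=0}^{j} \mu(x|S_i) \right]$ and let $H_j(x|S_j \setminus \partial\mathcal{S}_j) =
\mathbf{E}_{x|S_j \cap \partial\mathcal{S}_j}\left[ \mu(x|S_j) \right]$. Letting $G_{-1} = 1$, observe that, for $0 \le j \le 2^k - 1$,

\[ G_j = \mathbf{E}_{x|\bigcup\mathcal{S}_{j-1}}\left[ \prod_{i=0}^{j-1} \mu(x|S_i) \cdot H_j(x|S_j \setminus \partial\mathcal{S}_j) \right] \le \max(H_j) \cdot G_{j-1}. \]

To obtain a bound on $\max(H_j)$, consider an arbitrary partition of $[1,m]$ into two sets $E, F$. Let $\nu$ be a
distribution on $\{0,1\}^m$, and let $p(x|E) = \mathbf{E}_{x|F}[\nu(x)]$. Then, $p(x|E) = \sum_{x|F} 2^{-|F|} \nu(x) = 2^{-|F|} \sum_{x|F} \nu(x) \le 2^{-|F|} =
2^{|E| - m}$, simply using the fact that $\nu$ is a probability distribution. Thus, $\max(H_j) \le 2^{|S_j \setminus \partial\mathcal{S}_j| - m}$. Inductively,

\[ \mathbf{E}_{x|\bigcup\mathcal{S}}\left[ \prod_{i=0}^{2^k - 1} \mu(x|S_i) \right] \le \frac{2^{\sum_{j=0}^{2^k - 1} |S_j \setminus \partial\mathcal{S}_j|}}{2^{m 2^k}}. \]

Consider some index $z \in \bigcup\mathcal{S}$. Suppose this index appears in $l$ sets $S_{j_1}, \ldots, S_{j_l}$ from $\mathcal{S}$, with $j_1 < \cdots < j_l$.
Then, this index contributes exactly $l - 1$ to the expression $\sum_{j=0}^{2^k - 1} |S_j \setminus \partial\mathcal{S}_j|$, once for every $j = j_2, \ldots, j_l$
(for $j = j_1$, $z \in \partial\mathcal{S}_j$ because no set before $S_j$ contains $z$). Since this holds for every index $z$, we see that
$\sum_{j=0}^{2^k - 1} |S_j \setminus \partial\mathcal{S}_j| = q(\mathcal{S})$ and therefore $\mathbf{E}_x\left[ \prod_{u \in \{0,1\}^k} \mu(x|S_u) \right] \le 2^{q(\mathcal{S}) - m 2^k}$. □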
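The counting identity at the end of this proof, that the prefix sum of $|S_j \setminus \partial\mathcal{S}_j|$ equals $q(\mathcal{S})$, can be checked on random families. The sketch below assumes families are lists of Python sets in a fixed order; all function names are hypothetical.

```python
import random
from collections import Counter

def boundary(prefix):
    """Indices that occur in exactly one set of the given list of sets."""
    counts = Counter(i for S in prefix for i in S)
    return {i for i, c in counts.items() if c == 1}

def prefix_sum(family):
    """Sum over j of |S_j minus boundary(S_0, ..., S_j)|."""
    total = 0
    for j, S_j in enumerate(family):
        total += len(S_j - boundary(family[: j + 1]))
    return total

def conflicts(family, m):
    """q(S) = m * (number of sets) - |union of the sets|."""
    return m * len(family) - len(set().union(*family))

rng = random.Random(3)
n, m, k = 12, 3, 3
checks = []
for _ in range(100):
    family = [set(rng.sample(range(1, n + 1), m)) for _ in range(2 ** k)]
    checks.append(prefix_sum(family) == conflicts(family, m))
print(all(checks))
```

An element of $S_j$ is missing from the boundary of the prefix exactly when it already appeared in an earlier set, so each index is counted once per repeat occurrence.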



Proof of Lemma 4.6. Fix $y^0, y^1$ such that $\neg A$. The multi-set $\mathcal{S}$ is constructed from the sets $S_u = \phi(y_1^{u_1}, \ldots, y_k^{u_k})$
for $u \in \{0,1\}^k$. Since $A$ did not occur, the $2^k$ points where $\phi$ gets evaluated are distinct. Furthermore,
$\phi$ is chosen at random, which is equivalent to choosing $2^k$ random $m$-element subsets of $[1,n]$. We can
overestimate the number of conflicts in $\mathcal{S}$ as follows. Instead of choosing, for each subset, $m$ elements
from $[1,n]$ without replacement, suppose we chose them with replacement. The number of conflicts we will
obtain can only be larger than in the original experiment or, equivalently, the probability of obtaining a fixed
number of conflicts can only be greater in the second experiment. The maximum range of $\mathcal{S}$ is $m 2^k$. Every
conflict in $\mathcal{S}$ arises when we select a previously selected point from $[1,n]$. Thus, the probability of each
conflict is independently at most $m 2^k/n$. The probability of obtaining $q$ conflicts is at most $(m 2^k/n)^q$. □